Predicting Translation Equivalents in Linked WordNets

Authors

  • Krasimir Angelov
  • Gleb Lobanov
Abstract

We present an algorithm for predicting translation equivalents between two languages, based on their WordNets. We assume that every synset in one language is linked to the corresponding synset in the other. In theory, given the exact sense of a word in context, it should be possible to translate it as any of the words in the linked synset. In practice this works poorly, because accurate automatic word sense disambiguation is difficult. A more robust alternative is to define a translation relation directly between the lexemes of the two languages. As far as we know, the Finnish WordNet is the only one that includes such a relation. Our algorithm can be used to predict the relation for other languages as well. This is useful, for instance, in hybrid machine translation systems, which usually depend more heavily on high-quality translation dictionaries.
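The abstract does not spell out the algorithm itself, so the following is only an illustrative sketch of the setting it describes, not the authors' method. It assumes that linked synsets are available as pairs of lemma sets, enumerates every lexeme pair that co-occurs in a linked synset pair, and ranks candidates by how many linked synset pairs support them. The data, the function names, and the counting heuristic are all assumptions made for illustration.

```python
from collections import Counter

# Toy data, invented for illustration (not taken from any real WordNet):
# each entry pairs the lemmas of a source-language synset with the lemmas
# of the target-language synset it is linked to.
LINKED_SYNSETS = [
    ({"bank", "depository"}, {"pankki"}),
    ({"bank", "riverbank"}, {"ranta", "penger"}),
    ({"bank"}, {"pankki", "rahalaitos"}),
]


def candidate_translations(linked_synsets):
    """Count, for every (source lemma, target lemma) pair, the number of
    linked synset pairs in which both lemmas appear."""
    counts = Counter()
    for source_lemmas, target_lemmas in linked_synsets:
        for s in source_lemmas:
            for t in target_lemmas:
                counts[(s, t)] += 1
    return counts


def rank_for(lemma, counts):
    """Return the target lemmas proposed for `lemma`, best supported first.
    Ties are broken alphabetically so the order is deterministic."""
    scored = [(t, n) for (s, t), n in counts.items() if s == lemma]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


counts = candidate_translations(LINKED_SYNSETS)
ranking = rank_for("bank", counts)
```

A simple count like this only shows why sense-independent lexeme pairs can be read off linked synsets without disambiguating the word first; a real predictor would need a better-founded scoring function and an evaluation against a resource that already contains the translation relation.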


Similar resources

Coping with Lexical Gaps when Building Aligned Multilingual Wordnets

In this paper we present a methodology for automatically classifying the translation equivalents of a machine-readable bilingual dictionary into three main groups: lexical units, lexical gaps (that is, cases where a lexical concept of one language has no counterpart in the other language) and translation equivalents that need to be manually classified as lexical units or lexical gaps. This...


Fine-Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets

The paper presents a method for word sense disambiguation based on parallel corpora. The method exploits recent advances in word alignment and word clustering based on automatic extraction of translation equivalents, and is supported by available aligned wordnets for the languages in the corpus. The wordnets are aligned to the Princeton Wordnet, according to the principles established by Euro...


Cross-Lingual Sentiment Analysis for Indian Languages using Linked WordNets

Cross-Lingual Sentiment Analysis (CLSA) is the task of predicting the polarity of the opinion expressed in a text in a language Ltest using a classifier trained on the corpus of another language Ltrain. Popular approaches use Machine Translation (MT) to convert the test document in Ltest to Ltrain and use the classifier of Ltrain. However, MT systems do not exist for most pairs of languages ...


IndoWordNet

India is a multilingual country where machine translation and cross-lingual search are highly relevant problems. These problems require large resources, like wordnets and lexicons, of high quality and coverage. Wordnets are lexical structures composed of synsets and semantic relations. Synsets are sets of synonyms. They are linked by semantic relations like hypernymy (is-a), meronymy (part-of), tro...


Multilingual Word Sense Disambiguation Using Aligned Wordnets

Word Sense Disambiguation (WSD from now on) is an established task within the Natural Language Processing community, aiming at finding the right sense of a word occurring in free running text through the use of a computer algorithm. Currently, most WSD approaches consider only monolingual texts and, as such, rely mainly on the discriminatory power of the words appearing in th...




Publication date: 2016